Fractal Characteristic-Based Endpoint Detection for Whispered Speech
Abstract
This paper proposes a fractal-based approach to endpoint detection in whispered speech. The approach rests on the observation that whispered speech is sufficiently chaotic to be analyzed with fractal theory. Because silence, noise, and speech segments have fractal dimensions in different ranges, speech and non-speech segments can be separated with a simple decision scheme. Experiments performed on two databases and with different methods show the efficiency of the proposed whispered speech segmentation algorithm.
Key-Words: Endpoint detection; Fractal; Fractal Dimension; Robust; Whispered speech
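The idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which fractal dimension estimator or decision rule the paper uses, so everything below is an assumption made for illustration: the Higuchi estimator stands in for the paper's estimator, the frame length and hop are arbitrary, and the `low`/`high` range in `speech_mask` is a hypothetical parameter that would have to be tuned on real data. The function names are likewise hypothetical.

```python
import numpy as np


def higuchi_fd(x, k_max=8):
    """Estimate the Higuchi fractal dimension of a 1-D frame.

    Assumption: Higuchi's estimator is used here for illustration only;
    the source abstract does not name an estimator.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    ks = np.arange(1, k_max + 1)
    lengths = []
    for k in ks:
        per_offset = []
        for m in range(k):
            idx = np.arange(m, n, k)
            if idx.size < 2:
                continue
            dist = np.abs(np.diff(x[idx])).sum()
            # Normalisation from Higuchi's curve-length definition.
            norm = (n - 1) / ((idx.size - 1) * k)
            per_offset.append(dist * norm / k)
        lengths.append(np.mean(per_offset))
    lengths = np.asarray(lengths)
    if np.any(lengths <= 0):
        # A perfectly flat frame (e.g. digital silence) is a smooth curve.
        return 1.0
    # The dimension is the slope of log L(k) against log(1/k).
    slope, _ = np.polyfit(np.log(1.0 / ks), np.log(lengths), 1)
    return float(slope)


def frame_fd(signal, frame_len=400, hop=200, k_max=8):
    """Fractal dimension of each frame (frame_len and hop are arbitrary)."""
    signal = np.asarray(signal, dtype=float)
    starts = range(0, len(signal) - frame_len + 1, hop)
    return np.array([higuchi_fd(signal[s:s + frame_len], k_max) for s in starts])


def speech_mask(fd, low, high):
    """Hypothetical decision rule: a frame is speech if low <= FD <= high."""
    fd = np.asarray(fd)
    return (fd >= low) & (fd <= high)


def find_endpoints(mask, frame_len, hop):
    """Turn a per-frame boolean mask into (start_sample, end_sample) pairs."""
    padded = np.concatenate(([False], np.asarray(mask, dtype=bool), [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    # Edges alternate: first frame of a run, then one past its last frame.
    return [(int(s) * hop, (int(e) - 1) * hop + frame_len)
            for s, e in zip(edges[::2], edges[1::2])]
```

A typical use would be `find_endpoints(speech_mask(frame_fd(signal), low, high), 400, 200)`, with `low` and `high` chosen from the fractal dimension ranges observed for speech on labelled recordings.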
Similar resources
Significance of Parametric Spectral Ratio Methods in Detection and Recognition of Whispered Speech
In this article the significance of a new parametric spectral ratio method that can be used to detect whispered speech segments within normally phonated speech is described. Adaptation methods based on the maximum likelihood linear regression (MLLR) are then used to realize a mismatched train-test style speech recognition system. This proposed parametric spectral ratio method computes a ratio s...
A whispered Mandarin corpus for speech technology applications
Whispered speech is a natural mode of speech in which voicing is absent – its acoustics differ significantly from normally spoken speech or so-called neutral speech, such that it is challenging to use only neutral speech to build speech processing and automatic recognition systems that can deal effectively with whisper. At the same time, humans can naturally produce and perceive whispered speec...
Unsupervised detection of whispered speech in the presence of normal phonation
The results of an investigation into unsupervised detection of whispered speech segments in the presence of normally phonated speech are presented. The Whispered Speech Detection system presented here extracts features which exploit both waveform energy and periodicity. Unsupervised classification of these features was performed to identify and label long segments (approx. 2 to 2.5 seconds) of whi...
Acoustic analysis and feature transformation from neutral to whisper for speaker identification within whispered speech audio streams
Whispered speech is an alternative speech production mode from neutral speech, which is used by talkers intentionally in natural conversational scenarios to protect privacy and to avoid certain content from being overheard or made public. Due to the profound differences between whispered and neutral speech in vocal excitation and vocal tract function, the performance of automatic speaker identi...
Transfer Learning with Bottleneck Feature Networks for Whispered Speech Recognition
Previous work on whispered speech recognition has shown that acoustic models (AM) trained on whispered speech can somewhat classify unwhispered (neutral) speech sounds, but not vice versa. In fact, AMs trained purely on neutral speech completely fail to recognize whispered speech. Meanwhile, recipes used to train neutral AMs will work just as well for whispered speech, but such methods require ...